This paper studies the least upper bounds on coverage probabilities of the
empirical likelihood ratio confidence regions based on estimating equations.
The implications of the bounds on empirical likelihood inference are also
discussed.

1. Introduction. The fact that there is a nontrivial upper bound (less
than one) on the coverage probability of an empirical likelihood ratio
confidence region is most easily seen through that for the mean. In this case
the confidence region is nested within the convex hull of the sample. Thus,
regardless of its confidence level, a nontrivial upper bound on its coverage
probability is the probability that the convex hull covers the mean.

Several factors affect the value of the upper bound: the underlying
distribution, the sample size and the dimension of the mean. In empirical
likelihood inference the underlying distribution is not available. Thus, even
for the simple case of the mean, the upper bound on coverage probability cannot
be determined. Interestingly, however, for a large class of empirical likelihood
ratio confidence regions, including those for the mean, the least upper bound
on the coverage probability is available. This paper studies this least upper
bound and its implications for empirical likelihood inference.



2. Main results. To set up notation, consider a parameter of interest
$\theta_0$ of a continuous random vector $Y$. Let $Y_1, Y_2, \ldots, Y_n$ be
$n$ i.i.d. copies of $Y$. Let $m(Y,\theta) \in \mathbb{R}^k$ be an estimating
function for $\theta_0$ that is continuous in $Y$.
M. TSAO

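The empirical likelihood ratio function for $\theta_0$ is

$$R(\theta_0) = \sup\Bigl\{\prod_{i=1}^n n p_i \Bigm| \sum_{i=1}^n p_i\, m(Y_i,\theta_0) = 0,\ p_i \ge 0,\ \sum_{i=1}^n p_i = 1\Bigr\}, \tag{2.1}$$

where $0$ is the origin in $\mathbb{R}^k$. See Owen (2001) and Qin and Lawless (1994).
The log likelihood ratio $l(\theta)$ is given by $l(\theta) = -2\log R(\theta)$. The
empirical likelihood ratio confidence region for $\theta_0$ is given by

$$C_r = \{\theta \mid l(\theta) \le r\}, \tag{2.2}$$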

where r is a finite quantity determined by the desired confidence level 
through the method of calibration of choice. Throughout this paper the 
sample size n and the dimension of the estimating function k are assumed 
fixed unless we specify them to be otherwise. 
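To make (2.1) and (2.2) concrete, the following is a minimal sketch for a scalar estimating function ($k = 1$). It assumes the standard Lagrange-multiplier form of the solution, $p_i = 1/\{n(1 + \lambda m_i)\}$ with $\sum_i m_i/(1 + \lambda m_i) = 0$, and finds $\lambda$ by bisection. The function name `el_log_ratio` and the choice of bisection are illustrative assumptions, not part of the paper.

```python
import math

def el_log_ratio(m_values, tol=1e-12, max_iter=200):
    """Return l = -2 log R for scalar estimating-function values m_i.

    Solves sum_i m_i / (1 + lam * m_i) = 0 for the Lagrange multiplier lam
    by bisection; the optimal weights are p_i = 1 / (n * (1 + lam * m_i)).
    Returns math.inf when 0 is not strictly inside [min m_i, max m_i].
    """
    lo_m, hi_m = min(m_values), max(m_values)
    if not (lo_m < 0.0 < hi_m):
        return math.inf  # 0 is outside the interior of the convex hull

    # All weights are positive only for lam in (-1/hi_m, -1/lo_m).
    lam_lo, lam_hi = -1.0 / hi_m, -1.0 / lo_m

    def score(lam):
        return sum(m / (1.0 + lam * m) for m in m_values)

    for _ in range(max_iter):
        lam = 0.5 * (lam_lo + lam_hi)
        if score(lam) > 0.0:  # score is strictly decreasing in lam
            lam_lo = lam
        else:
            lam_hi = lam
        if lam_hi - lam_lo < tol:
            break
    lam = 0.5 * (lam_lo + lam_hi)
    return 2.0 * sum(math.log1p(lam * m) for m in m_values)

# Inference for a mean: m(Y, theta) = Y - theta (hypothetical data).
sample = [0.0, 3.0]
print(el_log_ratio([y - 1.0 for y in sample]))   # theta = 1 lies inside the hull
print(el_log_ratio([y - 5.0 for y in sample]))   # theta = 5 lies outside: inf
```

A candidate $\theta$ belongs to $C_r$ exactly when the returned value does not exceed $r$, so any $\theta$ for which the function returns infinity is excluded no matter how large $r$ is.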

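Denote by $\mathcal{H}(m(Y_1,\theta_0), m(Y_2,\theta_0), \ldots, m(Y_n,\theta_0))$ the
convex hull of the $m(Y_i,\theta_0)$. Because $l(\theta_0)$ is finite if and only if
$0$ is in the interior of the convex hull, the event $\{\theta_0 \in C_r\}$ implies
$\{0 \in \mathcal{H}(m(Y_1,\theta_0), m(Y_2,\theta_0), \ldots, m(Y_n,\theta_0))\}$. Thus,

$$P(\theta_0 \in C_r) \le P[0 \in \mathcal{H}(m(Y_1,\theta_0), m(Y_2,\theta_0), \ldots, m(Y_n,\theta_0))]. \tag{2.3}$$

Further, $P(\theta_0 \in C_r)$ is a monotone increasing function of $r$ and

$$\lim_{r \to \infty} P(\theta_0 \in C_r) = P[0 \in \mathcal{H}(m(Y_1,\theta_0), m(Y_2,\theta_0), \ldots, m(Y_n,\theta_0))]. \tag{2.4}$$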

Hence, the bound in the right-hand side of (2.3) is the least upper bound on 
the coverage probability of the confidence region (2.2) associated with the 
particular $m(Y,\theta_0)$. This bound, however, is in general not available because
the distribution of $m(Y,\theta_0)$ is not available. We consider instead the least
upper bound $B$,

$$B = \sup\{P(\theta_0 \in C_r)\} = \sup\{P[0 \in \mathcal{H}(m(Y_1,\theta_0), m(Y_2,\theta_0), \ldots, m(Y_n,\theta_0))]\},$$

where the supremum is taken over all empirical likelihood ratio confidence
regions based on estimating equations (2.1) and (2.2), or equivalently, all
meaningful $m(Y,\theta_0)$ and $r$.

In order to find $B$ without having to characterize the set of all meaningful
$m(Y,\theta_0)$, let $X_1, X_2, \ldots, X_n$ be i.i.d. copies of an arbitrary continuous
random vector $X$ in $\mathbb{R}^k$ and denote by $\mathcal{H}(X_1, X_2, \ldots, X_n)$ their
convex hull. Consider $b(k,n)$ given by

$$b(k,n) = \sup_X \{P[0 \in \mathcal{H}(X_1, X_2, \ldots, X_n)]\}, \tag{2.5}$$

where the supremum is taken over all possible continuous random vectors
in $\mathbb{R}^k$. We claim that (i) $b(k,n)$ is attained at an $X$ if and only if the
distribution of its projection on the unit sphere $X^p$ is symmetric with respect
to $0$ and (ii) $b(k,n)$ is the least upper bound $B$. Once (i) is established,
(ii) then follows from the fact that any $m(Y,\theta_0)$ is a special case of $X$
and that $b(k,n)$ is attained at a special $m(Y,\theta_0)$ for, say, the empirical
likelihood inference for the mean of the uniform distribution on the unit
sphere in $\mathbb{R}^k$. To see the latter point, since $Y$ is uniform on the unit sphere,
$\theta_0 = E(Y) = 0$ and $m(Y,\theta_0) = Y - \theta_0 = Y$. Hence, this $m(Y,\theta_0)$ and its
projection are both symmetric with respect to $0$. To prove claim (i), we
need the following lemma.

Lemma 1. For any continuous $X$ in $\mathbb{R}^k$, let $v_i = \|X_i\|_2$ and without loss
of generality assume $v_i > 0$. Let $X_i^p = v_i^{-1} X_i$ be the projection of $X_i$ on the
unit sphere. Then

$$P\{0 \in \mathcal{H}(X_1, X_2, \ldots, X_n)\} = P\{0 \in \mathcal{H}(X_1^p, X_2^p, \ldots, X_n^p)\}.$$

Proof. It suffices to show that $0 \notin \mathcal{H}(X_1, X_2, \ldots, X_n)$ if and only if
$0 \notin \mathcal{H}(X_1^p, X_2^p, \ldots, X_n^p)$. The convex hull $\mathcal{H}(X_1, X_2, \ldots, X_n)$ does not
contain $0$ if and only if all $X_i$ are on one side of a hyperplane through $0$. All
$X_i$ are on one side of a hyperplane through $0$ if and only if their projections
$X_i^p$ are on one side of a hyperplane through $0$. All $X_i^p$ are on one side of a
hyperplane through $0$ if and only if their convex hull $\mathcal{H}(X_1^p, X_2^p, \ldots, X_n^p)$
does not contain $0$. Thus the lemma. $\Box$
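Lemma 1 can be checked numerically in the plane. The sketch below tests the half-plane condition from the proof with cross products and confirms that rescaling each point to unit length never changes the outcome. The helper names and the shifted Gaussian sampling distribution are illustrative assumptions.

```python
import math
import random

def origin_outside_hull_2d(points):
    """True if all points lie in an open half-plane through the origin.

    Looks for a point x_j such that every other point is strictly within a
    counter-clockwise half-turn of it (positive cross product).
    """
    for j, (ax, ay) in enumerate(points):
        if all(ax * by - ay * bx > 0.0
               for i, (bx, by) in enumerate(points) if i != j):
            return True
    return False

def project_to_circle(points):
    """Rescale each point to unit Euclidean norm."""
    return [(x / math.hypot(x, y), y / math.hypot(x, y)) for x, y in points]

rng = random.Random(0)
trials = 2000
agreements = 0
for _ in range(trials):
    # Arbitrary continuous, non-symmetric example: shifted Gaussian cloud.
    pts = [(rng.gauss(0.3, 1.0), rng.gauss(-0.2, 2.0)) for _ in range(5)]
    if origin_outside_hull_2d(pts) == origin_outside_hull_2d(project_to_circle(pts)):
        agreements += 1
print(agreements, "of", trials, "samples agree")
```

Because the two events coincide sample by sample, their probabilities are equal, which is the statement of the lemma for $k = 2$.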

Claim (i) implies that $b(k,n) = P\{0 \in \mathcal{H}(U_1, U_2, \ldots, U_n)\}$, where
$U_1, U_2, \ldots, U_n$ are i.i.d. copies of a uniform random vector $U$ supported on the
unit sphere in $\mathbb{R}^k$. We now prove this claim for $k = 1, 2$.

Theorem 1. Let $k = 1, 2$ and $n > k$. For any continuous $X$ in $\mathbb{R}^k$, we
have

$$P\{0 \in \mathcal{H}(X_1, X_2, \ldots, X_n)\} \le P\{0 \in \mathcal{H}(U_1, U_2, \ldots, U_n)\}. \tag{2.6}$$

Further, equality holds if and only if the distribution of the projection of $X$
on the unit sphere $X^p$ is symmetric with respect to $0$.

Proof. By Lemma 1 we only need to show that (2.6) holds for all
continuous $X$ supported on the unit sphere. Thus, we assume without loss
of generality that $X$ is supported on the unit sphere. Under this assumption,
the symmetry condition on $X^p$ in Theorem 1 is equivalent to the symmetry
condition on $X$ itself.

For $k = 1$, the unit sphere and, thus, the support of $X$ degenerates into
$\{-1, 1\}$. Let $p = P\{X = 1\}$. Then

$$P\{0 \in \mathcal{H}(X_1, X_2, \ldots, X_n)\} = 1 - p^n - (1 - p)^n.$$

Theorem 1 amounts to the simple observation that the function $1 - p^n - (1 - p)^n$
attains its unique maximum at $p = 1/2$, which corresponds to the uniform
distribution on $\{-1, 1\}$, the only symmetric distribution on $\{-1, 1\}$.
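The $k = 1$ case is easy to tabulate. The sketch below evaluates $1 - p^n - (1-p)^n$ on a grid and locates the maximizer. The second example, the mean of a standard exponential distribution (for which $P\{Y > E(Y)\} = e^{-1}$), is an assumption added here to show how an asymmetric distribution falls short of the bound.

```python
import math

def coverage_bound_1d(p, n):
    """P{0 in hull} when each projected X_i is +1 with probability p, else -1."""
    return 1.0 - p ** n - (1.0 - p) ** n

n = 5
grid = [i / 100 for i in range(101)]
best_p = max(grid, key=lambda p: coverage_bound_1d(p, n))
print("maximizer on grid:", best_p)
print("value at p = 1/2:", coverage_bound_1d(0.5, n))

# Assumed example: Y ~ Exp(1), m(Y, theta_0) = Y - 1, so p = P{Y > 1} = exp(-1).
print("Exp(1) mean, n = 5:", coverage_bound_1d(math.exp(-1.0), n))
```

For $n = 5$ no calibration can give the empirical likelihood region for this exponential mean a coverage probability above the last printed value, which is itself below the symmetric-case value.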

For $k = 2$, let $X$ be a continuous random variable on the unit circle
($0 \le X < 2\pi$) and for simplicity assume that its density $f(x)$ is continuous on
the circle. Define

$$G(x) = \int_x^{x+\pi} f(y)\,dy,$$

where $f(x) = f(2\pi + x)$. For $X_1, \ldots, X_{j-1}, X_{j+1}, \ldots, X_n$, denote the event
that they are in the half-circle $(X_j, X_j + \pi)$ by $A_j$. If $X_j > \pi$, this half-circle
represents the union of $(X_j, 2\pi)$ and $[0, X_j - \pi)$. Since the $X_i$ are i.i.d., we have
for $j = 1, 2, \ldots, n$

$$P\{A_j\} = \int_0^{2\pi} f(x)[G(x)]^{n-1}\,dx.$$

Further, $A_i \cap A_j = \emptyset$ for $i \ne j$, where $\emptyset$ denotes the empty set, and

$$\{0 \notin \mathcal{H}(X_1, X_2, \ldots, X_n)\} = \bigcup_{i=1}^n A_i.$$

It follows that for any $n > 1$,

$$P\{0 \notin \mathcal{H}(X_1, X_2, \ldots, X_n)\} = \sum_{i=1}^n P\{A_i\} = n \int_0^{2\pi} f(x)[G(x)]^{n-1}\,dx. \tag{2.7}$$

Noting that $P\{A_j\}$ equals the probability that $X_1, \ldots, X_{j-1}, X_{j+1}, \ldots, X_n$
are in the half-circle $(X_j - \pi, X_j)$, an equivalent expression for
$P\{0 \notin \mathcal{H}(X_1, X_2, \ldots, X_n)\}$ is

$$P\{0 \notin \mathcal{H}(X_1, X_2, \ldots, X_n)\} = \sum_{i=1}^n P\{A_i\} = n \int_0^{2\pi} f(x)[G(x - \pi)]^{n-1}\,dx. \tag{2.8}$$

Adding up (2.7) and (2.8) gives another expression for
$P\{0 \notin \mathcal{H}(X_1, X_2, \ldots, X_n)\}$,

$$P\{0 \notin \mathcal{H}(X_1, X_2, \ldots, X_n)\} = \frac{n}{2} \int_0^{2\pi} f(x)\{[G(x)]^{n-1} + [G(x - \pi)]^{n-1}\}\,dx. \tag{2.9}$$

To see that the equality in (2.6) holds if the distribution of $X$ is symmetric
with respect to $0$, note that the distribution is symmetric if and only if
$G(x) = 1/2$ for all $x \in [0, 2\pi)$. This and (2.7) imply that for all symmetric
$X$, including $U$,

$$P\{0 \in \mathcal{H}(X_1, X_2, \ldots, X_n)\} = 1 - n(1/2)^{n-1}. \tag{2.10}$$

To show that the inequality in (2.6) holds strictly if the distribution of $X$ is
not symmetric and, thus, that the distribution must be symmetric if the equality
holds, first note that for any $n > 2$ and $p \in [0, 1]$, the function
$h(p) = p^{n-1} + (1 - p)^{n-1}$ achieves its unique minimum at $p = 1/2$ and this
minimum is $h(1/2) = (1/2)^{n-2}$. Since $G(x), G(x - \pi) \ge 0$ and
$G(x) + G(x - \pi) = 1$, for any $n > 1$,

$$(1/2)^{n-2} \le [G(x)]^{n-1} + [G(x - \pi)]^{n-1}. \tag{2.11}$$

If the distribution of $X$ is not symmetric, $G(x)$ cannot be $1/2$ for all
$x \in [0, 2\pi)$. Further, $G(x)$ is continuously differentiable. There exists an open
subinterval of $[0, 2\pi)$ in which $G(x) \ne 1/2$ and $G'(x) < 0$. Over this
subinterval $f(x) > 0$ and the inequality in (2.11) holds strictly. Multiply both
sides of (2.11) by $f(x)$ and then integrate from $0$ to $2\pi$. We have

$$(1/2)^{n-2} < \int_0^{2\pi} f(x)\{[G(x)]^{n-1} + [G(x - \pi)]^{n-1}\}\,dx, \tag{2.12}$$

where the left-hand side is strictly smaller than the right-hand side because
of the subinterval. It follows from (2.9), (2.12) and (2.10) that for an $X$ that
is not symmetric,

$$P\{0 \in \mathcal{H}(X_1, X_2, \ldots, X_n)\} < 1 - n(1/2)^{n-1} = P\{0 \in \mathcal{H}(U_1, U_2, \ldots, U_n)\}. \qquad \Box$$
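A small Monte Carlo experiment illustrates Theorem 1 for $k = 2$. It uses the fact that points on the unit circle surround the origin exactly when no empty arc between neighbouring points exceeds $\pi$. The skewed angle distribution (angle $2\pi V^2$ with $V$ uniform on $[0,1)$), the seed and the replication count are arbitrary assumptions chosen for illustration.

```python
import math
import random

def origin_in_hull_from_angles(angles):
    """True if points at these angles on the unit circle surround the origin,
    i.e. no empty arc between neighbouring points is longer than pi."""
    a = sorted(angles)
    gaps = [hi - lo for hi, lo in zip(a[1:], a[:-1])]
    gaps.append(2.0 * math.pi - (a[-1] - a[0]))  # wrap-around arc
    return max(gaps) < math.pi

def estimate(sampler, n, reps, rng):
    """Monte Carlo estimate of P{0 in hull} for n i.i.d. angles."""
    hits = sum(origin_in_hull_from_angles([sampler(rng) for _ in range(n)])
               for _ in range(reps))
    return hits / reps

def uniform_angle(rng):
    return 2.0 * math.pi * rng.random()

def skewed_angle(rng):
    # Hypothetical non-symmetric density: mass piles up near angle 0.
    return 2.0 * math.pi * rng.random() ** 2

n, reps = 5, 20000
rng = random.Random(12345)
exact_uniform = 1.0 - n * 0.5 ** (n - 1)  # right-hand side of (2.10)
est_uniform = estimate(uniform_angle, n, reps, rng)
est_skewed = estimate(skewed_angle, n, reps, rng)
print("exact, uniform:    ", exact_uniform)
print("simulated, uniform:", est_uniform)
print("simulated, skewed: ", est_skewed)
```

The uniform estimate should sit close to $1 - n(1/2)^{n-1}$, while the skewed distribution should give a visibly smaller value, in line with the strict inequality for non-symmetric $X$.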

For $k \ge 3$, a proof of (2.6) has eluded us so far due to difficulties in
finding an analytic expression for $P\{0 \notin \mathcal{H}(X_1, X_2, \ldots, X_n)\}$ for a general
$X$ in high dimensions. Thus, claim (i) has been proved for only $k \le 2$. We
conjecture that claim (i) holds for all $k$. The rest of our discussion assumes
this conjecture holds so that $b(k,n) = P\{0 \in \mathcal{H}(U_1, U_2, \ldots, U_n)\}$ for all $k$.
Wendel (1962) gives a formula for $P\{0 \notin \mathcal{H}(U_1, U_2, \ldots, U_n)\}$ which leads
to the following expression for $b(k,n) = P\{0 \in \mathcal{H}(U_1, U_2, \ldots, U_n)\}$: for any
$n > k$,

$$b(k,n) = 1 - \Bigl\{\binom{n-1}{0} + \binom{n-1}{1} + \cdots + \binom{n-1}{k-1}\Bigr\}\Bigl(\frac{1}{2}\Bigr)^{n-1}. \tag{2.13}$$

It is interesting to note that, by (2.13), when the sample size is twice
the dimension, the value of the least upper bound $b(k, 2k)$ equals
0.5. Theorem 2 further explores the implications of (2.13).
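Formula (2.13) is straightforward to evaluate. The sketch below computes $b(k,n)$ with exact integer binomial coefficients and prints the bound for $n = rk$, $r = 2, \ldots, 8$. The function name `wendel_bound` is an illustrative choice.

```python
from math import comb

def wendel_bound(k, n):
    """b(k, n) from (2.13): 1 - 2^{-(n-1)} * sum_{j=0}^{k-1} C(n-1, j), n > k."""
    if not (n > k >= 1):
        raise ValueError("requires n > k >= 1")
    return 1.0 - sum(comb(n - 1, j) for j in range(k)) / 2 ** (n - 1)

for k in (1, 2, 5):
    print(k, [round(wendel_bound(k, r * k), 4) for r in range(2, 9)])
```

The value $b(k, 2k) = 0.5$ follows from the symmetry of the binomial coefficients $\binom{2k-1}{j}$ about $j = k - 1/2$: the first $k$ of them sum to exactly half of $2^{2k-1}$.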

Theorem 2. Denote by $[x]$ the largest integer smaller than $x$. For any
$n > k$,

(a) $b(k, n+1) > b(k,n)$ and $b(k,n) > b(k+1,n)$, and

(b) for any $\varepsilon \in (0, 0.5)$, $b([\varepsilon n], n) \to 1$ and $b([(1-\varepsilon)n], n) \to 0$ as $n \to \infty$.

Proof. The inequalities in (a) follow easily from (2.13). To see (b) is
true, consider the binomial random variable $X \sim \mathrm{Bin}(n-1, 1/2)$ with $n - 1$
trials and success probability $1/2$. Denote by $Z$ the standard normal random
variable. By (2.13) we have

$$b([\varepsilon n], n) = 1 - P\{X \le [\varepsilon n] - 1\} \approx 1 - P\Bigl\{Z \le \frac{[\varepsilon n] - 1 - (n-1)/2}{\sqrt{n-1}/2}\Bigr\}.$$

The right-hand side and, thus, $b([\varepsilon n], n)$ go to one when $n$ goes to infinity.
Similarly, $b([(1-\varepsilon)n], n)$ goes to zero as $n$ goes to infinity. $\Box$
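Both parts of Theorem 2 can be observed numerically from (2.13). In the sketch below, `math.floor` stands in for $[x]$, which is an assumption: the two agree whenever $x$ is not an integer. The value $\varepsilon = 0.25$ and the sample sizes are arbitrary illustrative choices.

```python
from math import comb, floor

def wendel_bound(k, n):
    """b(k, n) from (2.13), valid for n > k >= 1."""
    return 1.0 - sum(comb(n - 1, j) for j in range(k)) / 2 ** (n - 1)

# Part (a): the bound increases with n and decreases with k.
print(wendel_bound(3, 10), wendel_bound(3, 11), wendel_bound(4, 10))

# Part (b): dimension growing proportionally to the sample size.
eps = 0.25
for n in (20, 40, 80, 160, 320):
    k_small = floor(eps * n)          # stands in for [eps * n]
    k_large = floor((1 - eps) * n)    # stands in for [(1 - eps) * n]
    print(n, round(wendel_bound(k_small, n), 6), round(wendel_bound(k_large, n), 6))
```

The middle column approaches one and the last column approaches zero, matching the limits in part (b).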

3. Concluding remarks. The least upper bound $b(k,n)$ may be surprisingly
small when the ratio $n/k$ is small. Table 1 shows values of the bound at
various combinations of $k$ and $n$. When this ratio is small and an empirical
likelihood ratio confidence region of a high confidence level is desired, it is
essential that the bound be computed to check whether such a confidence level
is attainable at all. We have come across examples in the literature where
regions of impossibly high confidence levels were computed. Practitioners need
to be aware of the bound.
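One practical way to use the bound is to compare it with the nominal level before computing a region. The sketch below finds the smallest sample size at which $b(k,n)$ from (2.13) reaches a given level. The helper `smallest_feasible_n` is hypothetical, and for $k \ge 3$ the result relies on the conjecture that (2.13) gives the least upper bound.

```python
from math import comb

def wendel_bound(k, n):
    """b(k, n) from (2.13), valid for n > k >= 1."""
    return 1.0 - sum(comb(n - 1, j) for j in range(k)) / 2 ** (n - 1)

def smallest_feasible_n(k, level):
    """Smallest n > k with b(k, n) >= level (hypothetical helper)."""
    if not 0.0 < level < 1.0:
        raise ValueError("level must lie strictly between 0 and 1")
    n = k + 1
    while wendel_bound(k, n) < level:
        n += 1
    return n

for k in (1, 2, 5):
    print("k =", k, "needs n >=", smallest_feasible_n(k, 0.95), "for a 95% region")
```

Reaching this sample size is necessary but not sufficient: the bound is attained only for symmetric projections, so the achievable coverage for a particular $m(Y,\theta_0)$ may still be lower.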

For any fixed $n$, the bound $b(k,n)$ is a strictly decreasing function of
$k$. When the sample size $n$ is not large, practitioners need to be aware of
the negative impact of incorporating extra information about the parameter
that will increase the dimension of the estimating equation $k$: (i) high
confidence levels may become unachievable and (ii) continuous approximations
to the finite sample distribution of the empirical log likelihood ratio may also
become less accurate. The latter may diminish the benefit of incorporating
the extra information and may, for some cases where $n$ is not large, result
in a loss in coverage accuracy for the empirical likelihood ratio confidence
region [Tsao (2004)].

Table 1
Bounds for some combinations of n and k, r = n/k

 k    r = 2    r = 3    r = 4    r = 5    r = 6    r = 7    r = 8
 1   0.5000   0.7500   0.8750   0.9375   0.9688   0.9844   0.9922
 2   0.5000   0.8125   0.9375   0.9805   0.9941   0.9983   0.9995
 5   0.5000   0.9102   0.9904   0.9992   0.9999   1.0000   1.0000

The method of empirical likelihood has been applied to some very
high-dimensional problems and there is increasing interest in the asymptotic
behavior of the empirical log likelihood ratio when the sample size $n$ and the
dimension of the estimating equation $k$ both tend to infinity. By Theorem
2, when $n \le \gamma k$ for some constant $\gamma \in (1, 2)$ and $n$ goes to infinity, the
distribution of the empirical log likelihood ratio $l(\theta_0)$ will degenerate into a
point mass at infinity. There are no meaningful confidence regions of the
form (2.2) in this case.

On related future research problems, we note that in light of the lack 
of awareness of the bounds, a method of calibration which automatically 
respects the bounds may be helpful. Tsao (2004) contains some preliminary 
results on one such method. It may be possible to derive similar bounds for 
certain classes of empirical likelihood ratio confidence regions outside of the 
estimating equation framework (2.1) and (2.2). The conjecture that claim (i) 
holds for all k is another interesting question that we are still working on. 

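To conclude, while trying to determine the value of the bound, based on
derivations for $k = 1, 2$ and some asymptotic observations we had communicated
to several colleagues the conjecture that for any $k$ and $n > k$,

$$P\{0 \notin \mathcal{H}(U_1, U_2, \ldots, U_n)\} = \Bigl\{\binom{n-1}{0} + \binom{n-1}{1} + \cdots + \binom{n-1}{k-1}\Bigr\}\Bigl(\frac{1}{2}\Bigr)^{n-1}.$$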

We are indebted to Professor Qi-Man Shao who brought to our attention 
related work by J. G. Wendel, B. Efron and others. Efron (1965) appears 
to be the first to give formulae (2.7) and (2.8). Wendel (1962) has already 
noted and proved the conjecture. Citing connections to L. J. Savage, R. E. 
Machol and D. A. Darling, Wendel (1962) also gives an interesting historical 
note on the origin of the conjecture. 